# Vietnamese speech recognition
Whisper Small Vi
MIT
An automatic speech recognition model based on openai/whisper-small and fine-tuned on Vietnamese speech data to improve Vietnamese transcription accuracy and robustness
Speech Recognition
Transformers Other

namphungdn134
334
2
Whisper Base Vi
MIT
A speech recognition model based on openai/whisper-base and fine-tuned on 100 hours of Vietnamese speech data to improve Vietnamese transcription accuracy
Speech Recognition
Transformers Other

namphungdn134
215
3
Chunkformer Large Vie
A large-scale Vietnamese automatic speech recognition model based on the ChunkFormer architecture, fine-tuned on approximately 3000 hours of publicly available Vietnamese speech data, with excellent performance.
Speech Recognition
PyTorch Other
khanhld
1,765
12
Vi Whisper Large V3 Turbo V1
Whisper-V3-Turbo model optimized for Vietnamese automatic speech recognition (ASR) tasks, fine-tuned using multiple Vietnamese datasets
Speech Recognition
Transformers Other

suzii
182
7
Viwhisper Medium
MIT
Whisper-medium model optimized for Vietnamese speech recognition tasks, fine-tuned on 1308 hours of Vietnamese data
Speech Recognition
Transformers Other

NhutP
139
4
Whisper Tiny Vi
Apache-2.0
Vietnamese automatic speech recognition (ASR) model fine-tuned from OpenAI's Whisper-tiny, demonstrating excellent performance on multiple Vietnamese datasets
Speech Recognition
Transformers Other

doof-ferb
44
2
Phowhisper Medium
BSD-3-Clause
PhoWhisper is a series of models designed specifically for Vietnamese automatic speech recognition (ASR). It achieves high robustness by fine-tuning the Whisper model on an 844-hour Vietnamese accent dataset.
Speech Recognition
Transformers Other

vinai
2,999
10
Phowhisper Small
BSD-3-Clause
PhoWhisper is a system specifically designed for Vietnamese automatic speech recognition, fine-tuned based on the Whisper model, supporting various Vietnamese accents.
Speech Recognition
Transformers Other

vinai
2,725
8
Wav2vec2 Bartpho
This is an automatic speech recognition model supporting Vietnamese, capable of outputting normalized text, timestamp labeling, and multi-speaker segmentation.
Speech Recognition
Transformers Other

nguyenvulebinh
472
6
Whisper Large V2 Vietnamese
Apache-2.0
This model is an automatic speech recognition (ASR) model based on OpenAI's Whisper Small architecture, fine-tuned on the Common Voice 11.0 Vietnamese dataset
Speech Recognition
Transformers Other

DrishtiSharma
25
2
Wav2vec2 Large Vi Vlsp2020
Vietnamese automatic speech recognition model based on wav2vec2 architecture, pre-trained with 13,000 hours of unlabeled YouTube audio and fine-tuned on 250 hours of labeled data
Speech Recognition
Transformers Other

nguyenvulebinh
385
4
Wav2vec2 Base Vietnamese 160h
Vietnamese speech recognition model based on Wav2vec2, fine-tuned on 160 hours of Vietnamese speech data
Speech Recognition
Transformers Other

khanhld
356
10
Viwav2vec2 Base 3k
This model is a Wav2Vec2 base model pre-trained on 3,000 hours of Vietnamese speech data, suitable for Vietnamese speech recognition tasks, and requires fine-tuning on downstream tasks for use.
Speech Recognition
Transformers Other

dragonSwing
41
2
Viwav2vec2 Base 1.5k
This model is a Wav2Vec2 base model pre-trained on 1,500 hours of Vietnamese speech data. It is suitable for Vietnamese speech recognition tasks but requires fine-tuning before use.
Speech Recognition
Transformers Other

dragonSwing
38
0
Wav2vec NCKH 2022
Vietnamese automatic speech recognition model based on Wav2vec2 architecture, supporting audio-to-text conversion
Speech Recognition
Transformers Other

hoangbinhmta99
29
0
Wav2vec2 Large Xls R 300m Vietnamese Colab
Apache-2.0
This model is a Vietnamese speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset
Speech Recognition
Transformers

Jungwonchang
22
0
Wav2vec2 Base Vietnamese
Apache-2.0
Vietnamese speech recognition model based on the Wav2Vec2 architecture, fine-tuned on the VLSP dataset; expects speech input sampled at 16 kHz
Speech Recognition
Transformers Other

dragonSwing
16
2
Fb Youtube Vi Large
Apache-2.0
This model is an automatic speech recognition model fine-tuned on Vietnamese YouTube informal audio datasets, based on facebook/wav2vec2-large-xlsr-53.
Speech Recognition
Transformers

phongdtd
31
1
Wavlm VLSP Vi
A Vietnamese automatic speech recognition model fine-tuned from microsoft/wavlm-base-plus on the PHONGDTD/VINDATAVLSP - NA dataset
Speech Recognition
Transformers

phongdtd
21
0
Wavlm Vindata Demo Dist
An automatic speech recognition model fine-tuned from microsoft/wavlm-base on Vietnamese datasets
Speech Recognition
Transformers

phongdtd
17
0
Wav2vec2 Base Vn 270h
A speech recognition model fine-tuned with approximately 270 hours of Vietnamese annotated data, supporting Vietnamese automatic speech recognition tasks
Speech Recognition Other
dragonSwing
202
8
Xls Asr Vi 40h
Apache-2.0
This model is a speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 7.0 Vietnamese dataset and private datasets.
Speech Recognition
Transformers Other

geninhu
14
0
Fb Vindata Vi Large
Apache-2.0
This model is a Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the PHONGDTD/VINDATAVLSP - NA dataset
Speech Recognition
Transformers

phongdtd
29
0
Xls Asr Vi 40h 1B
Apache-2.0
Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on 40 hours of data from the FPT Open Speech Dataset (FOSD) and Common Voice 7.0
Speech Recognition
Transformers Other

geninhu
23
0
Wav2vec2 Large Xlsr 53 Vietnamese
Apache-2.0
A Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained using the Common Voice dataset.
Speech Recognition Other
anuragshas
279
2
Fine Tune XLSR Wav2Vec2 Speech2Text Vietnamese
Apache-2.0
This is a Vietnamese automatic speech recognition (ASR) repair model based on the MT5 architecture, fine-tuned for Vietnamese speech recognition tasks.
Speech Recognition Other
leduytan93
25
0
Wav2vec2 Large Xlsr 53 Vietnamese
Apache-2.0
A Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53; expects audio input sampled at 16 kHz.
Speech Recognition
Transformers Other

not-tanh
22
4
Wav2vec2 Base Vietnamese 250h
Vietnamese automatic speech recognition model based on wav2vec 2.0 architecture, trained on 13,000 hours of unlabeled audio and 250 hours of labeled data
Speech Recognition
Transformers Other

nguyenvulebinh
6,868
39
Wav2vec2 Large Xlsr Vietnamese
Apache-2.0
This is a Vietnamese speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice and Infore_25h datasets.
Speech Recognition Other
CuongLD
37
1
Wav2vec2 Large Xlsr Vietnamese
Apache-2.0
Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53
Speech Recognition Other
Nhut
22
0
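
Several of the Wav2Vec2 entries above note that they expect 16 kHz audio. The sketch below shows one way audio at another sample rate might be brought to 16 kHz before transcription. `resample_linear` and `transcribe` are hypothetical helper names, not part of any listed model's code. Plain linear interpolation is used only to keep the example dependency-free; it applies no anti-aliasing filter, so a dedicated resampler is likely a better choice in practice. The `transcribe` helper assumes the chosen checkpoint can be loaded by the Transformers `automatic-speech-recognition` pipeline, which may not hold for every model in this list.

```python
def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono signal by linear interpolation (illustrative only).

    `samples` is a sequence of floats. No low-pass filtering is applied,
    so downsampling this way can introduce aliasing.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if len(samples) == 0 or src_rate == dst_rate:
        return [float(s) for s in samples]

    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        if lo >= last:
            # Past the final input sample: hold the last value.
            out.append(float(samples[last]))
        else:
            frac = pos - lo
            out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


def transcribe(samples, src_rate, model_id):
    """Transcribe audio with a Hub checkpoint (assumes pipeline support).

    `model_id` is supplied by the caller, for example the Hub ID of one of
    the listed checkpoints. Imports are local so that the resampling helper
    stays usable without numpy or transformers installed.
    """
    import numpy as np
    from transformers import pipeline

    audio = np.asarray(resample_linear(samples, src_rate), dtype=np.float32)
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr({"raw": audio, "sampling_rate": 16000})["text"]
```

Calling `resample_linear(audio, 44100)` returns a new list at 16 kHz. `transcribe` downloads the checkpoint on first use, so it needs network access and the `transformers` and `numpy` packages.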